Re-Ranking Approach of Spoken Term Detection Using Conditional Random Fields-Based Triphone Detection

Authors

Naoki Sawada

Hiromitsu Nishizaki

Abstract

This study proposes a two-pass spoken term detection (STD) method. The first pass performs phoneme-based dynamic time warping (DTW) STD, and the second pass recomputes the detection scores produced by the first pass using conditional random fields (CRF)-based triphone detectors. In the second pass, we treat STD as a sequence labeling problem. We use CRF-based triphone detection models based on features generated from multiple types of phoneme-based transcriptions. The models learn recognition error patterns, such as phoneme-to-phoneme confusions, in the CRF framework. Consequently, the models can detect a triphone that makes up part of a query term, together with a detection probability. In the experimental evaluation on two types of test collections, the CRF-based approach worked well in re-ranking the DTW-based detections. CRF-based re-ranking yielded absolute F-measure improvements of 2.1% and 2.0% on the two test collections, respectively.

Key words: conditional random fields, phoneme-to-phoneme confusion learning, re-ranking, spoken term detection, triphone detection
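The two-pass idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The 0/1 phoneme distance, the averaging of triphone probabilities, the linear interpolation with `weight`, and the names `subsequence_dtw`, `triphones`, and `rerank` are all assumptions made for this example. The CRF triphone detectors are not implemented; their output is represented by a plain dictionary of hypothetical probabilities.

```python
from typing import Dict, List, Sequence, Tuple

Triphone = Tuple[str, ...]


def subsequence_dtw(query: Sequence[str], doc: Sequence[str]) -> Tuple[float, int]:
    """First pass (assumed form): best match of `query` anywhere inside a
    non-empty `doc`. Returns (cost normalized by query length, end index)."""
    n, m = len(query), len(doc)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of query[:i] that ends at doc[:j].
    # Row 0 is all zeros so that a match may start at any document position.
    cost = [[0.0] * (m + 1)] + [[inf] * (m + 1) for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed 0/1 local distance between phoneme labels.
            local = 0.0 if query[i - 1] == doc[j - 1] else 1.0
            cost[i][j] = local + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    end = min(range(1, m + 1), key=lambda j: cost[n][j])
    return cost[n][end] / n, end


def triphones(phonemes: Sequence[str]) -> List[Triphone]:
    """Split a phoneme sequence into overlapping triphones."""
    return [tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]


def rerank(
    query: Sequence[str],
    candidates: List[Tuple[str, float]],
    triphone_prob: Dict[Tuple[str, Triphone], float],
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Second pass (assumed form): combine each first-pass DTW cost with the
    mean detection probability of the query's triphones in that utterance.
    `triphone_prob` stands in for the output of CRF-based triphone detectors."""
    tris = triphones(query)
    rescored = []
    for utt_id, dtw_cost in candidates:
        dtw_score = 1.0 - min(dtw_cost, 1.0)  # map cost to a [0, 1] score
        probs = [triphone_prob.get((utt_id, t), 0.0) for t in tris]
        crf_score = sum(probs) / len(probs) if probs else 0.0
        rescored.append((utt_id, weight * dtw_score + (1.0 - weight) * crf_score))
    return sorted(rescored, key=lambda item: item[1], reverse=True)
```

With two candidates that tie on DTW cost, the one whose utterance has higher triphone detection probabilities moves to the top of the list, which is the kind of re-ranking effect the abstract describes.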

Similar resources

Combination of DTW-based and CRF-based Spoken Term Detection on the NTCIR-11 SpokenQuery&Doc SQ-STD Subtask

Conventional spoken term detection (STD) techniques, which use a text-based matching approach based on automatic speech recognition (ASR) systems, are not robust to speech recognition errors. This paper proposes a conditional random fields (CRF)-based combination (re-ranking) approach, which recomputes detection scores produced by a phoneme-based dynamic time warping (DTW) STD approach. In the ...

An IWAPU STD System for OOV Query Terms and Spoken Queries

We have been proposing a Spoken Term Detection (STD) method for Out-Of-Vocabulary (OOV) query terms integrating various subword recognition results using monophone, triphone, demiphone, one third phone, and Sub-phonetic segment (SPS) models[1][2]. In this paper, we describe two methods for text OOV query terms and spoken queries. For text OOV query terms, we introduce four unique methods. First...

Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages

Acoustic feature similarity between search results has been shown to be very helpful for the task of spoken term detection (STD). A graph-based re-ranking approach for STD has been proposed based on the concept that search results, which are acoustically similar to other results with higher confidence scores, should have higher scores themselves. In this approach, the similarity between all sea...

Spoken question answering using tree-structured conditional random fields and two-layer random walk

In this paper, we consider a spoken question answering (QA) task, in which the questions are in the form of speech, while the knowledge source for answers is webpages (in text) on the Internet, accessed through an information retrieval engine. We mainly focus on the query formulation and re-ranking parts. Because the recognition results for the spoken questions are less reliable, we use N-bes...

Intensive acoustic models constructed by integrating low-occurrence models for spoken term detection

Triphone acoustic models are often used as subword models for detecting out-of-vocabulary query terms in Spoken Term Detection (STD) systems. Our preliminary experiments revealed that the training data for a large portion of the approximately 8,000 triphone models are insufficient. Assuming that such insufficient models deteriorate the performance of STD, this paper proposes intensive triphone ...

Journal:

IEICE Transactions

Volume 99-D

Publication date: 2016
